Overview

Dataset statistics

Number of variables: 22
Number of observations: 344
Missing cells: 2752
Missing cells (%): 36.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 82.1 KiB
Average record size in memory: 244.4 B
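
The missing-cell figures are consistent with the warnings in this report, which flag eight variables as missing in all 344 observations. A small arithmetic check, using only numbers that appear in the report (the variable names are illustrative):

```python
# Figures copied from this report's overview and warnings.
n_variables = 22
n_observations = 344
fully_missing_variables = 8  # the eight columns flagged as 100.0% missing

missing_cells = fully_missing_variables * n_observations
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 2752
print(round(missing_pct, 1))  # 36.4
```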

Variable types

Numeric: 10
Unsupported: 8
Categorical: 4

Warnings

operation_car has constant value "29.0" [Constant]
destination_esr is highly correlated with operation_st_esr [High correlation]
operation_st_esr is highly correlated with destination_esr [High correlation]
operation_date is highly correlated with operation_car and 1 other field [High correlation]
rodvag is highly correlated with operation_car [High correlation]
operation_car is highly correlated with operation_date and 2 other fields [High correlation]
adm is highly correlated with operation_date and 1 other field [High correlation]
index_train has 344 (100.0%) missing values [Missing]
danger has 344 (100.0%) missing values [Missing]
loaded has 344 (100.0%) missing values [Missing]
operation_train has 344 (100.0%) missing values [Missing]
rod_train has 344 (100.0%) missing values [Missing]
ssp_station_esr has 344 (100.0%) missing values [Missing]
ssp_station_id has 344 (100.0%) missing values [Missing]
weight_brutto has 344 (100.0%) missing values [Missing]
df_index has unique values [Unique]
index_train is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
danger is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
loaded is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
operation_train is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
rod_train is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
ssp_station_esr is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
ssp_station_id is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
weight_brutto is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
receiver has 17 (4.9%) zeros [Zeros]
sender has 18 (5.2%) zeros [Zeros]
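
To make the simpler warning types concrete, here is a minimal sketch of how Constant, Missing, Unique and Zeros flags could be derived for a single column. It is an illustration only: the function names and the exact rules are assumptions, and pandas-profiling's own checks are configurable and may differ.

```python
import math


def _is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def column_flags(values):
    """Return a list of illustrative warning labels for one column."""
    n = len(values)
    present = [v for v in values if not _is_missing(v)]
    distinct = set(present)
    flags = []
    if len(present) < n:
        flags.append("Missing")
    if len(distinct) == 1:
        flags.append("Constant")
    if n > 0 and len(distinct) == n:
        flags.append("Unique")
    if any(v == 0 for v in present):
        flags.append("Zeros")
    return flags


print(column_flags(["29.0"] * 344))       # ['Constant']
print(column_flags([None] * 344))         # ['Missing']
print(column_flags(list(range(1, 345))))  # ['Unique']
print(column_flags([0, 0, 5, 5, 7]))      # ['Zeros']
```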

Reproduction

Analysis started: 2021-04-16 09:05:16.524724
Analysis finished: 2021-04-16 09:05:35.818698
Duration: 19.29 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
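
A report of this kind can be regenerated along the following lines. This is a hedged sketch: the DataFrame `df`, the report title and the output file name are assumptions, since the source data and the options used for this run are not part of the report.

```python
def build_profile_report(df, output_path="report.html"):
    """Profile a pandas DataFrame and write the HTML report to output_path."""
    # Imported lazily so this module still loads where the package is absent.
    from pandas_profiling import ProfileReport

    report = ProfileReport(df, title="Dataset profile")  # title is a placeholder
    report.to_file(output_path)
    return output_path
```

Calling `build_profile_report(df)` with the original DataFrame should produce an HTML file with the same sections as this report, although timings and memory figures will vary between runs.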

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 344
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2233827.663
Minimum: 64867
Maximum: 4043546
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 64867
5th percentile: 72016.05
Q1: 1215677.5
Median: 2392990
Q3: 3880534.75
95th percentile: 4039605.25
Maximum: 4043546
Range: 3978679
Interquartile range (IQR): 2664857.25
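
The range and IQR follow directly from the other quantile statistics. A short check using the df_index figures above (variable names are illustrative):

```python
# Quantile statistics reported for df_index.
minimum, maximum = 64867, 4043546
q1, q3 = 1215677.5, 3880534.75

value_range = maximum - minimum  # spread of the whole column
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range)  # 3978679
print(iqr)          # 2664857.25
```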

Descriptive statistics

Standard deviation: 1339825.287
Coefficient of variation (CV): 0.5997890121
Kurtosis: -1.107076368
Mean: 2233827.663
Median Absolute Deviation (MAD): 1187245
Skewness: -0.1572587093
Sum: 768436716
Variance: 1.7951318 × 10^12
Monotonicity: Strictly increasing
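
Two of the descriptive statistics are simple functions of the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A quick check against the df_index figures, which are rounded in the report, so agreement is approximate:

```python
import math

# Rounded values as printed in the report for df_index.
mean = 2233827.663
std = 1339825.287

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation

print(round(cv, 4))       # 0.5998
print(f"{variance:.4e}")  # 1.7951e+12
```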
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
71680               1      0.3%
73568               1      0.3%
4039606             1      0.3%
3453417             1      0.3%
3882400             1      0.3%
3880807             1      0.3%
73572               1      0.3%
64867               1      0.3%
73570               1      0.3%
1925008             1      0.3%
Other values (334)  334    97.1%

Smallest values
Value    Count  Frequency (%)
64867    1      0.3%
71214    1      0.3%
71282    1      0.3%
71288    1      0.3%
71680    1      0.3%
71684    1      0.3%
71686    1      0.3%
71688    1      0.3%
71690    1      0.3%
71696    1      0.3%

Largest values
Value    Count  Frequency (%)
4043546  1      0.3%
4043540  1      0.3%
4043334  1      0.3%
4043330  1      0.3%
4043303  1      0.3%
4043296  1      0.3%
4043289  1      0.3%
4043282  1      0.3%
4043223  1      0.3%
4041761  1      0.3%

index_train
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

length
Real number (ℝ≥0)

Distinct: 10
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8820930233
Minimum: 0.79
Maximum: 1.26
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 0.79
5th percentile: 0.83
Q1: 0.83
Median: 0.83
Q3: 0.83
95th percentile: 1.06
Maximum: 1.26
Range: 0.47
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.1147299151
Coefficient of variation (CV): 0.130065551
Kurtosis: 3.164683975
Mean: 0.8820930233
Median Absolute Deviation (MAD): 0
Skewness: 2.037784078
Sum: 303.44
Variance: 0.01316295342
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Common values
Value  Count  Frequency (%)
0.83   265    77.0%
1.06   41     11.9%
1.26   13     3.8%
0.79   9      2.6%
1      7      2.0%
1.03   3      0.9%
0.86   2      0.6%
1.25   2      0.6%
1.01   1      0.3%
1.22   1      0.3%

Smallest values
Value  Count  Frequency (%)
0.79   9      2.6%
0.83   265    77.0%
0.86   2      0.6%
1      7      2.0%
1.01   1      0.3%
1.03   3      0.9%
1.06   41     11.9%
1.22   1      0.3%
1.25   2      0.6%
1.26   13     3.8%

Largest values
Value  Count  Frequency (%)
1.26   13     3.8%
1.25   2      0.6%
1.22   1      0.3%
1.06   41     11.9%
1.03   3      0.9%
1.01   1      0.3%
1      7      2.0%
0.86   2      0.6%
0.83   265    77.0%
0.79   9      2.6%

car_number
Real number (ℝ≥0)

Distinct: 291
Distinct (%): 84.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33081874.29
Minimum: 30000491
Maximum: 64046667
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 30000491
5th percentile: 30075811.55
Q1: 30695415
Median: 30848452.5
Q3: 30883327.25
95th percentile: 42255982.8
Maximum: 64046667
Range: 34046176
Interquartile range (IQR): 187912.25

Descriptive statistics

Standard deviation: 6006289.835
Coefficient of variation (CV): 0.1815583296
Kurtosis: 10.18321082
Mean: 33081874.29
Median Absolute Deviation (MAD): 42509
Skewness: 3.04798723
Sum: 1.138016476 × 10^10
Variance: 3.607551759 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
30840409            2      0.6%
30853360            2      0.6%
30853188            2      0.6%
30251995            2      0.6%
30817472            2      0.6%
30814537            2      0.6%
30840516            2      0.6%
30840466            2      0.6%
30684070            2      0.6%
30253397            2      0.6%
Other values (281)  324    94.2%

Smallest values
Value     Count  Frequency (%)
30000491  1      0.3%
30008791  1      0.3%
30009898  1      0.3%
30013395  1      0.3%
30013692  1      0.3%
30017099  1      0.3%
30017693  1      0.3%
30018998  1      0.3%
30019392  1      0.3%
30019699  1      0.3%

Largest values
Value     Count  Frequency (%)
64046667  1      0.3%
63841308  1      0.3%
63144927  1      0.3%
61274643  1      0.3%
60302023  1      0.3%
58960428  1      0.3%
58960311  1      0.3%
56846975  1      0.3%
56838634  1      0.3%
52564887  1      0.3%

destination_esr
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 915003.2035
Minimum: 843200
Maximum: 988109
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 843200
5th percentile: 843200
Q1: 904705
Median: 904705
Q3: 918407
95th percentile: 988109
Maximum: 988109
Range: 144909
Interquartile range (IQR): 13702

Descriptive statistics

Standard deviation: 33388.80259
Coefficient of variation (CV): 0.03649036688
Kurtosis: 1.247045576
Mean: 915003.2035
Median Absolute Deviation (MAD): 13702
Skewness: 0.4380156003
Sum: 314761102
Variance: 1114812138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values
Value   Count  Frequency (%)
904705  159    46.2%
918407  98     28.5%
988109  38     11.0%
843200  20     5.8%
946801  17     4.9%
853005  10     2.9%
964809  1      0.3%
906503  1      0.3%

Smallest values
Value   Count  Frequency (%)
843200  20     5.8%
853005  10     2.9%
904705  159    46.2%
906503  1      0.3%
918407  98     28.5%
946801  17     4.9%
964809  1      0.3%
988109  38     11.0%

Largest values
Value   Count  Frequency (%)
988109  38     11.0%
964809  1      0.3%
946801  17     4.9%
918407  98     28.5%
906503  1      0.3%
904705  159    46.2%
853005  10     2.9%
843200  20     5.8%

adm
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB

20.0: 327
33.0: 17

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1376
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
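
As an illustration of these character properties, the sketch below recomputes the character and category totals for adm from the value counts given in this section (327 rows of "20.0" and 17 rows of "33.0"). It uses Python's standard `unicodedata` module; this is not necessarily how the report itself was computed.

```python
import unicodedata
from collections import Counter

# The adm column as reported: 327 rows of "20.0" and 17 rows of "33.0".
values = ["20.0"] * 327 + ["33.0"] * 17

char_counts = Counter(ch for value in values for ch in value)
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

print(sum(char_counts.values()))  # 1376
print(char_counts["0"])           # 671
print(category_counts["Nd"])      # 1032  (Nd = Decimal Number)
print(category_counts["Po"])      # 344   (Po = Other Punctuation)
```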

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20.0
2nd row: 20.0
3rd row: 20.0
4th row: 20.0
5th row: 20.0

Value  Count  Frequency (%)
20.0   327    95.1%
33.0   17     4.9%

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0      671    48.8%
.      344    25.0%
2      327    23.8%
3      34     2.5%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1032   75.0%
Other Punctuation  344    25.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      671    65.0%
2      327    31.7%
3      34     3.3%

Other Punctuation
Value  Count  Frequency (%)
.      344    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1376   100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      671    48.8%
.      344    25.0%
2      327    23.8%
3      34     2.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1376   100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      671    48.8%
.      344    25.0%
2      327    23.8%
3      34     2.5%

danger
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

gruz
Real number (ℝ≥0)

Distinct: 13
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 254680.6831
Minimum: 233010
Maximum: 999993
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 233010
5th percentile: 236038
Q1: 236038
Median: 236038
Q3: 236038
95th percentile: 321067
Maximum: 999993
Range: 766983
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 56329.7644
Coefficient of variation (CV): 0.221178001
Kurtosis: 91.32038154
Mean: 254680.6831
Median Absolute Deviation (MAD): 0
Skewness: 7.757277469
Sum: 87610155
Variance: 3173042357
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value             Count  Frequency (%)
236038            274    79.7%
321067            40     11.6%
303069            12     3.5%
242128            7      2.0%
421161            2      0.6%
233010            2      0.6%
351306            1      0.3%
542188            1      0.3%
411155            1      0.3%
411263            1      0.3%
Other values (3)  3      0.9%

Smallest values
Value   Count  Frequency (%)
233010  2      0.6%
236038  274    79.7%
242128  7      2.0%
302032  1      0.3%
303069  12     3.5%
321067  40     11.6%
351306  1      0.3%
411155  1      0.3%
411263  1      0.3%
421161  2      0.6%

Largest values
Value   Count  Frequency (%)
999993  1      0.3%
542188  1      0.3%
435060  1      0.3%
421161  2      0.6%
411263  1      0.3%
411155  1      0.3%
351306  1      0.3%
321067  40     11.6%
303069  12     3.5%
302032  1      0.3%

loaded
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

operation_car
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB

29.0: 344

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1376
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 29.0
2nd row: 29.0
3rd row: 29.0
4th row: 29.0
5th row: 29.0

Value  Count  Frequency (%)
29.0   344    100.0%

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
2      344    25.0%
9      344    25.0%
.      344    25.0%
0      344    25.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1032   75.0%
Other Punctuation  344    25.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      344    33.3%
9      344    33.3%
0      344    33.3%

Other Punctuation
Value  Count  Frequency (%)
.      344    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1376   100.0%

Most frequent character per script

Value  Count  Frequency (%)
2      344    25.0%
9      344    25.0%
.      344    25.0%
0      344    25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1376   100.0%

Most frequent character per block

Value  Count  Frequency (%)
2      344    25.0%
9      344    25.0%
.      344    25.0%
0      344    25.0%

operation_date
Categorical

HIGH CORRELATION

Distinct: 21
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.7 KiB

2020-07-29 05:05:00: 53
2020-07-17 05:03:00: 53
2020-07-14 13:55:00: 38
2020-07-15 22:14:00: 38
2020-07-27 17:58:00: 37
Other values (16): 125

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 6536
Distinct characters: 12
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 1.2%

Sample

1st row: 2020-07-17 05:03:00
2nd row: 2020-07-17 05:03:00
3rd row: 2020-07-17 05:03:00
4th row: 2020-07-17 05:03:00
5th row: 2020-07-17 05:03:00

Value                Count  Frequency (%)
2020-07-29 05:05:00  53     15.4%
2020-07-17 05:03:00  53     15.4%
2020-07-14 13:55:00  38     11.0%
2020-07-15 22:14:00  38     11.0%
2020-07-27 17:58:00  37     10.8%
2020-07-25 18:53:00  22     6.4%
2020-07-20 12:10:00  20     5.8%
2020-07-30 17:05:00  20     5.8%
2020-07-23 13:00:00  16     4.7%
2020-07-09 05:49:00  14     4.1%
Other values (11)    33     9.6%

Histogram of lengths of the category

Value              Count  Frequency (%)
2020-07-29         53     7.7%
05:05:00           53     7.7%
05:03:00           53     7.7%
2020-07-17         53     7.7%
2020-07-14         53     7.7%
2020-07-27         40     5.8%
22:14:00           38     5.5%
2020-07-15         38     5.5%
13:55:00           38     5.5%
17:58:00           37     5.4%
Other values (25)  232    33.7%

Most occurring characters

Value             Count  Frequency (%)
0                 2101   32.1%
2                 960    14.7%
-                 688    10.5%
:                 688    10.5%
7                 494    7.6%
5                 416    6.4%
1                 393    6.0%
(space)           344    5.3%
3                 194    3.0%
4                 112    1.7%
Other values (2)  146    2.2%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     4816   73.7%
Dash Punctuation   688    10.5%
Other Punctuation  688    10.5%
Space Separator    344    5.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2101   43.6%
2      960    19.9%
7      494    10.3%
5      416    8.6%
1      393    8.2%
3      194    4.0%
4      112    2.3%
9      81     1.7%
8      65     1.3%

Dash Punctuation
Value  Count  Frequency (%)
-      688    100.0%

Space Separator
Value    Count  Frequency (%)
(space)  344    100.0%

Other Punctuation
Value  Count  Frequency (%)
:      688    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  6536   100.0%

Most frequent character per script

Value             Count  Frequency (%)
0                 2101   32.1%
2                 960    14.7%
-                 688    10.5%
:                 688    10.5%
7                 494    7.6%
5                 416    6.4%
1                 393    6.0%
(space)           344    5.3%
3                 194    3.0%
4                 112    1.7%
Other values (2)  146    2.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6536   100.0%

Most frequent character per block

Value             Count  Frequency (%)
0                 2101   32.1%
2                 960    14.7%
-                 688    10.5%
:                 688    10.5%
7                 494    7.6%
5                 416    6.4%
1                 393    6.0%
(space)           344    5.3%
3                 194    3.0%
4                 112    1.7%
Other values (2)  146    2.2%

operation_st_esr
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 915003.2035
Minimum: 843200
Maximum: 988109
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 843200
5th percentile: 843200
Q1: 904705
Median: 904705
Q3: 918407
95th percentile: 988109
Maximum: 988109
Range: 144909
Interquartile range (IQR): 13702

Descriptive statistics

Standard deviation: 33388.80259
Coefficient of variation (CV): 0.03649036688
Kurtosis: 1.247045576
Mean: 915003.2035
Median Absolute Deviation (MAD): 13702
Skewness: 0.4380156003
Sum: 314761102
Variance: 1114812138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values
Value   Count  Frequency (%)
904705  159    46.2%
918407  98     28.5%
988109  38     11.0%
843200  20     5.8%
946801  17     4.9%
853005  10     2.9%
964809  1      0.3%
906503  1      0.3%

Smallest values
Value   Count  Frequency (%)
843200  20     5.8%
853005  10     2.9%
904705  159    46.2%
906503  1      0.3%
918407  98     28.5%
946801  17     4.9%
964809  1      0.3%
988109  38     11.0%

Largest values
Value   Count  Frequency (%)
988109  38     11.0%
964809  1      0.3%
946801  17     4.9%
918407  98     28.5%
906503  1      0.3%
904705  159    46.2%
853005  10     2.9%
843200  20     5.8%

operation_st_id
Real number (ℝ≥0)

Distinct: 8
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2000201913
Minimum: 2000036238
Maximum: 2001930738
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 2000036238
5th percentile: 2000036238
Q1: 2000036238
Median: 2000036472
Q3: 2000036820
95th percentile: 2001930660
Maximum: 2001930738
Range: 1894500
Interquartile range (IQR): 581.5

Descriptive statistics

Standard deviation: 535138.8186
Coefficient of variation (CV): 0.0002675423991
Kurtosis: 6.676191807
Mean: 2000201913
Median Absolute Deviation (MAD): 234
Skewness: 2.938941154
Sum: 6.880694582 × 10^11
Variance: 2.863735552 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values
Value       Count  Frequency (%)
2000036238  159    46.2%
2000036472  98     28.5%
2000039028  38     11.0%
2001930660  20     5.8%
2000037862  17     4.9%
2001930738  10     2.9%
2000038510  1      0.3%
2000036274  1      0.3%

Smallest values
Value       Count  Frequency (%)
2000036238  159    46.2%
2000036274  1      0.3%
2000036472  98     28.5%
2000037862  17     4.9%
2000038510  1      0.3%
2000039028  38     11.0%
2001930660  20     5.8%
2001930738  10     2.9%

Largest values
Value       Count  Frequency (%)
2001930738  10     2.9%
2001930660  20     5.8%
2000039028  38     11.0%
2000038510  1      0.3%
2000037862  17     4.9%
2000036472  98     28.5%
2000036274  1      0.3%
2000036238  159    46.2%

operation_train
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

receiver
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17083387.85
Minimum: 0
Maximum: 79311499
Zeros: 17
Zeros (%): 4.9%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 11866783
Q1: 14999355
Median: 14999355
Q3: 14999355
95th percentile: 58786880
Maximum: 79311499
Range: 79311499
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 11799436.96
Coefficient of variation (CV): 0.6906965443
Kurtosis: 9.828865565
Mean: 17083387.85
Median Absolute Deviation (MAD): 0
Skewness: 3.105661416
Sum: 5876685422
Variance: 1.392267125 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values
Value     Count  Frequency (%)
14999355  296    86.0%
58786880  20     5.8%
0         17     4.9%
11866783  7      2.0%
15082601  2      0.6%
68594560  1      0.3%
79311499  1      0.3%

Smallest values
Value     Count  Frequency (%)
0         17     4.9%
11866783  7      2.0%
14999355  296    86.0%
15082601  2      0.6%
58786880  20     5.8%
68594560  1      0.3%
79311499  1      0.3%

Largest values
Value     Count  Frequency (%)
79311499  1      0.3%
68594560  1      0.3%
58786880  20     5.8%
15082601  2      0.6%
14999355  296    86.0%
11866783  7      2.0%
0         17     4.9%

rodvag
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB

90.0: 293
40.0: 41
60.0: 7
93.0: 2
20.0: 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1376
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 90.0
2nd row: 90.0
3rd row: 90.0
4th row: 90.0
5th row: 90.0

Value  Count  Frequency (%)
90.0   293    85.2%
40.0   41     11.9%
60.0   7      2.0%
93.0   2      0.6%
20.0   1      0.3%

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0      686    49.9%
.      344    25.0%
9      295    21.4%
4      41     3.0%
6      7      0.5%
3      2      0.1%
2      1      0.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1032   75.0%
Other Punctuation  344    25.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      686    66.5%
9      295    28.6%
4      41     4.0%
6      7      0.7%
3      2      0.2%
2      1      0.1%

Other Punctuation
Value  Count  Frequency (%)
.      344    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1376   100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      686    49.9%
.      344    25.0%
9      295    21.4%
4      41     3.0%
6      7      0.5%
3      2      0.1%
2      1      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1376   100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      686    49.9%
.      344    25.0%
9      295    21.4%
4      41     3.0%
6      7      0.5%
3      2      0.1%
2      1      0.1%

rod_train
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

sender
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14763184.75
Minimum: 0
Maximum: 58786880
Zeros: 18
Zeros (%): 5.2%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 710091.9
Q1: 4733946
Median: 14999355
Q3: 14999355
95th percentile: 58786880
Maximum: 58786880
Range: 58786880
Interquartile range (IQR): 10265409

Descriptive statistics

Standard deviation: 14251262.81
Coefficient of variation (CV): 0.9653244237
Kurtosis: 4.439173207
Mean: 14763184.75
Median Absolute Deviation (MAD): 0
Skewness: 2.230435416
Sum: 5078535554
Variance: 2.030984917 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values
Value     Count  Frequency (%)
14999355  190    55.2%
4733946   106    30.8%
58786880  20     5.8%
0         18     5.2%
55545896  7      2.0%
52255181  2      0.6%
57790594  1      0.3%

Smallest values
Value     Count  Frequency (%)
0         18     5.2%
4733946   106    30.8%
14999355  190    55.2%
52255181  2      0.6%
55545896  7      2.0%
57790594  1      0.3%
58786880  20     5.8%

Largest values
Value     Count  Frequency (%)
58786880  20     5.8%
57790594  1      0.3%
55545896  7      2.0%
52255181  2      0.6%
14999355  190    55.2%
4733946   106    30.8%
0         18     5.2%

ssp_station_esr
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

ssp_station_id
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

tare_weight
Real number (ℝ≥0)

Distinct: 21
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 230.9680233
Minimum: 184
Maximum: 550
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 184
5th percentile: 184
Q1: 230
Median: 232
Q3: 237
95th percentile: 240
Maximum: 550
Range: 366
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 30.28940333
Coefficient of variation (CV): 0.1311411117
Kurtosis: 70.74989188
Mean: 230.9680233
Median Absolute Deviation (MAD): 3
Skewness: 6.530775486
Sum: 79453
Variance: 917.4479541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
Common values
Value              Count  Frequency (%)
232                90     26.2%
237                62     18.0%
230                62     18.0%
184                36     10.5%
240                29     8.4%
233                19     5.5%
270                13     3.8%
238                5      1.5%
210                5      1.5%
225                4      1.2%
Other values (11)  19     5.5%

Smallest values
Value  Count  Frequency (%)
184    36     10.5%
193    2      0.6%
210    5      1.5%
220    1      0.3%
222    3      0.9%
223    1      0.3%
225    4      1.2%
226    2      0.6%
227    1      0.3%
230    62     18.0%

Largest values
Value  Count  Frequency (%)
550    2      0.6%
270    13     3.8%
259    1      0.3%
242    1      0.3%
240    29     8.4%
239    2      0.6%
238    5      1.5%
237    62     18.0%
235    3      0.9%
233    19     5.5%

weight_brutto
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 344
Missing (%): 100.0%
Memory size: 2.8 KiB

Interactions

[Interaction plots: 90 images, not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
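
The calculation described above can be sketched in a few lines of plain Python. This is an illustration with made-up sample data, not the code the report used; the function name `pearson_r` is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of covariance and variances cancel, so sums are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_r(xs, ys)
# Shifting either variable, or rescaling it by a positive factor, leaves r unchanged.
r_shifted = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
print(math.isclose(r, r_shifted))  # True
```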

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
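
The rank-based calculation described above can be sketched as follows. The helper names `average_ranks` and `spearman_rho` are assumptions made for this example, and giving tied values the average of their ranks is one common convention, not necessarily the one used to produce this report.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the rank variables."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    if n < 2:
        return float("nan")
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if dev_x == 0 or dev_y == 0:
        return float("nan")
    return cov / (dev_x * dev_y)


# A nonlinear but strictly increasing relationship has perfectly aligned ranks.
rho_cubic = spearman_rho([1, 2, 3, 4], [1, 8, 27, 64])
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.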

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
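
The pair-counting definition above can be sketched directly. This is the simple form of the statistic (often called tau-a), written here under the assumed name `kendall_tau_a`. Statistical libraries often default to a tie-adjusted variant, so values can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau as (concordant - discordant) / total number of pairs."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        return float("nan")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        # A pair is concordant when x and y move in the same direction.
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Two of the three pairs are concordant and one is discordant.
tau_example = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

The double loop over pairs is quadratic in the number of observations, which is acceptable for a dataset of 344 rows but would be slow for very large inputs.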

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
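
A minimal sketch of the bias-corrected estimator follows, assuming the usual statement of Bergsma's correction: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1), and the table dimensions r and k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). The function name `cramers_v_corrected` is an assumption made for this example, and the code is not taken from the report's tooling.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of nominal labels."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        return float("nan")
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table (no continuity correction).
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # Undefined when either variable has a single category.
        return float("nan")
    return sqrt(phi2_corr / denom)


labels = ["a"] * 50 + ["b"] * 50
v_perfect = cramers_v_corrected(labels, ["u"] * 50 + ["v"] * 50)
v_independent = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
```

Clamping the corrected φ² at zero keeps the result in the 0 to 1 range described above when the sample association is weaker than the correction term.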

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
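
The per-column nullity counts behind the first plot can be reproduced with a short sketch. It is written in plain Python, with columns held as a dictionary of lists; the helper names `is_missing` and `nullity_by_column` are assumptions made for this example, and the values are the first three sample rows of two columns from this report.

```python
from math import isnan


def is_missing(value):
    """Treat None and floating-point NaN as missing."""
    return value is None or (isinstance(value, float) and isnan(value))


def nullity_by_column(columns):
    """Return the number of missing cells for each column name."""
    return {
        name: sum(1 for value in values if is_missing(value))
        for name, values in columns.items()
    }


nan = float("nan")
sample = {
    "tare_weight": [232.0, 240.0, 235.0],
    "weight_brutto": [nan, nan, nan],
}
missing_counts = nullity_by_column(sample)
```

On the full dataset, the eight columns reported as 100.0% missing would each show a count of 344.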

Sample

First rows

     df_index  index_train  length  car_number  destination_esr   adm  danger      gruz  loaded  operation_car       operation_date  operation_st_esr  operation_st_id  operation_train    receiver  rodvag  rod_train      sender  ssp_station_esr  ssp_station_id  tare_weight  weight_brutto
  0     64867          NaN    0.83    30842983         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        232.0            NaN
  1     71214          NaN    0.83    30817472         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        240.0            NaN
  2     71282          NaN    0.83    30824577         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        235.0            NaN
  3     71288          NaN    0.83    30824825         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        238.0            NaN
  4     71680          NaN    0.83    30840409         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        233.0            NaN
  5     71684          NaN    0.83    30840508         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        233.0            NaN
  6     71686          NaN    0.83    30840466         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        233.0            NaN
  7     71688          NaN    0.83    30840607         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        233.0            NaN
  8     71690          NaN    0.83    30840516         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        233.0            NaN
  9     71696          NaN    0.83    30848378         904705.0  20.0     NaN  236038.0     NaN           29.0  2020-07-17 05:03:00          904705.0     2.000036e+09              NaN  14999355.0    90.0        NaN   4733946.0              NaN             NaN        230.0            NaN

Last rows

     df_index  index_train  length  car_number  destination_esr   adm  danger      gruz  loaded  operation_car       operation_date  operation_st_esr  operation_st_id  operation_train    receiver  rodvag  rod_train      sender  ssp_station_esr  ssp_station_id  tare_weight  weight_brutto
334   4041761          NaN    0.83    30813661         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        235.0            NaN
335   4043223          NaN    0.83    30885602         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
336   4043282          NaN    0.83    30883409         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
337   4043289          NaN    0.83    30883300         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
338   4043296          NaN    0.83    30883433         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
339   4043303          NaN    0.83    30883425         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
340   4043330          NaN    0.83    30891287         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
341   4043334          NaN    0.83    30891154         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
342   4043540          NaN    0.83    30891352         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN
343   4043546          NaN    0.83    30891303         918407.0  20.0     NaN  236038.0     NaN           29.0  2020-07-15 22:14:00          918407.0     2.000036e+09              NaN  14999355.0    90.0        NaN  14999355.0              NaN             NaN        232.0            NaN